Self-Replicating Machines in Continuous Space with Virtual Physics
JohnnyVon is an implementation of self-replicating machines in
continuous two-dimensional space. Two types of particles drift
about in a virtual liquid. The particles are automata with
discrete internal states but continuous external relationships.
Their internal states are governed by finite state machines but
their external relationships are governed by a simulated physics
that includes Brownian motion, viscosity, and spring-like attractive
and repulsive forces. The particles can be assembled into patterns
that can encode arbitrary strings of bits. We demonstrate that, if
an arbitrary "seed" pattern is put in a "soup" of separate individual
particles, the pattern will replicate by assembling the individual
particles into copies of itself. We also show that, given sufficient
time, a soup of separate individual particles will eventually
spontaneously form self-replicating patterns. We discuss the implications
of JohnnyVon for research in nanotechnology, theoretical biology, and
artificial life.
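
The abstract above names the ingredients of the simulated physics (Brownian motion, viscosity, spring-like forces) but not the equations or constants. The following is only a minimal sketch of one time step for two particles under those three effects. The function name `step`, the linear spring law, the Gaussian noise model, and every parameter value are assumptions made for illustration; they are not taken from JohnnyVon.

```python
import math
import random


def step(pos, vel, rest_len=1.0, k=0.5, viscosity=0.1, noise=0.0, dt=0.1, rng=random):
    """Advance two particles by one time step.

    pos and vel are lists of two [x, y] pairs. All parameters are
    hypothetical; the abstract does not specify JohnnyVon's values.
    """
    dx = pos[1][0] - pos[0][0]
    dy = pos[1][1] - pos[0][1]
    dist = math.hypot(dx, dy)
    # Linear spring: attractive when stretched past rest_len, repulsive when compressed.
    f = k * (dist - rest_len)
    ux, uy = (dx / dist, dy / dist) if dist > 0 else (0.0, 0.0)
    forces = [(f * ux, f * uy), (-f * ux, -f * uy)]
    new_pos, new_vel = [], []
    for (x, y), (vx, vy), (fx, fy) in zip(pos, vel, forces):
        # Viscosity damps the old velocity; Gaussian kicks stand in for Brownian motion.
        vx = (1.0 - viscosity) * vx + fx * dt + rng.gauss(0.0, noise)
        vy = (1.0 - viscosity) * vy + fy * dt + rng.gauss(0.0, noise)
        new_vel.append([vx, vy])
        new_pos.append([x + vx * dt, y + vy * dt])
    return new_pos, new_vel
```

With `noise=0.0` the step is deterministic, which makes the spring behaviour easy to check: two particles placed farther apart than `rest_len` move toward each other, and two placed closer move apart.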
Distributional semantics beyond words: Supervised learning of analogy and paraphrase
There have been several efforts to extend distributional semantics beyond
individual words, to measure the similarity of word pairs, phrases, and
sentences (briefly, tuples; ordered sets of words, contiguous or
noncontiguous). One way to extend beyond words is to compare two tuples using a
function that combines pairwise similarities between the component words in the
tuples. A strength of this approach is that it works with both relational
similarity (analogy) and compositional similarity (paraphrase). However, past
work required hand-coding the combination function for different tasks. The
main contribution of this paper is that combination functions are generated by
supervised learning. We achieve state-of-the-art results in measuring
relational similarity between word pairs (SAT analogies and SemEval 2012 Task
2) and measuring compositional similarity between noun-modifier phrases and
unigrams (multiple-choice paraphrase questions).
Low Size-Complexity Inductive Logic Programming: The East-West Challenge Considered as a Problem in Cost-Sensitive Classification
The Inductive Logic Programming community has considered
proof-complexity and model-complexity, but, until recently,
size-complexity has received little attention. Recently a
challenge was issued "to the international computing community"
to discover low size-complexity Prolog programs for classifying
trains. The challenge was based on a problem first proposed by
Ryszard Michalski, 20 years ago. We interpreted the challenge
as a problem in cost-sensitive classification and we applied a
recently developed cost-sensitive classifier to the competition.
Our algorithm was relatively successful (we won a prize). This
paper presents our algorithm and analyzes the results of the
competition.
A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations
Recognizing analogies, synonyms, antonyms, and associations appear to be four
distinct tasks, requiring distinct NLP algorithms. In the past, the four
tasks have been treated independently, using a wide variety of algorithms.
These four semantic classes, however, are a tiny sample of the full
range of semantic phenomena, and we cannot afford to create ad hoc algorithms
for each semantic phenomenon; we need to seek a unified approach.
We propose to subsume a broad range of phenomena under analogies.
To limit the scope of this paper, we restrict our attention to the subsumption
of synonyms, antonyms, and associations. We introduce a supervised corpus-based
machine learning algorithm for classifying analogous word pairs, and we
show that it can solve multiple-choice SAT analogy questions, TOEFL
synonym questions, ESL synonym-antonym questions, and similar-associated-both
questions from cognitive psychology.
Similarity of Semantic Relations
There are at least two kinds of similarity. Relational similarity is
correspondence between relations, in contrast with attributional similarity,
which is correspondence between attributes. When two words have a high
degree of attributional similarity, we call them synonyms. When two pairs
of words have a high degree of relational similarity, we say that their
relations are analogous. For example, the word pair mason:stone is analogous
to the pair carpenter:wood. This paper introduces Latent Relational Analysis (LRA),
a method for measuring relational similarity. LRA has potential applications in many
areas, including information extraction, word sense disambiguation,
and information retrieval. Recently the Vector Space Model (VSM) of information
retrieval has been adapted to measuring relational similarity,
achieving a score of 47% on a collection of 374 college-level multiple-choice
word analogy questions. In the VSM approach, the relation between a pair of words is
characterized by a vector of frequencies of predefined patterns in a large corpus.
LRA extends the VSM approach in three ways: (1) the patterns are derived automatically
from the corpus, (2) the Singular Value Decomposition (SVD) is used to smooth the frequency
data, and (3) automatically generated synonyms are used to explore variations of the
word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the
average human score of 57%. On the related problem of classifying semantic relations, LRA
achieves similar gains over the VSM.
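
As an illustration of the VSM approach described in this abstract, the sketch below represents each word pair as a vector of pattern frequencies and answers a multiple-choice analogy question by picking the choice whose vector has the highest cosine with the stem pair. Using cosine as the similarity measure is an assumption here, since the abstract does not name one. The pattern counts, the second choice pair, and the helper names `cosine` and `best_choice` are invented for the example, and LRA's additional steps (automatic pattern derivation, SVD smoothing, synonym variations) are not shown.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector carries no relational evidence
    return dot / (norm_u * norm_v)


def best_choice(stem_vec, choice_vecs):
    """Return the index of the choice pair most similar to the stem pair."""
    scores = [cosine(stem_vec, c) for c in choice_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical corpus counts for three made-up patterns,
# e.g. "X cuts Y", "X works with Y", "X eats Y".
mason_stone = [12, 30, 0]
choices = [
    [10, 25, 1],  # carpenter:wood
    [0, 2, 40],   # bird:seed (made-up distractor)
]

print(best_choice(mason_stone, choices))
```

Because cosine depends only on the direction of the vectors, pairs that occur with very different overall frequencies in the corpus can still be scored as analogous.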
Generating Music from Literature
We present a system, TransProse, that automatically generates musical pieces
from text. TransProse uses known relations between elements of music such as
tempo and scale, and the emotions they evoke. Further, it uses a novel
mechanism to determine sequences of notes that capture the emotional activity
in the text. The work has applications in information visualization, in
creating audio-visual e-books, and in developing music apps.
Word Sense Disambiguation by Web Mining for Word Co-occurrence Probabilities
This paper describes the National Research Council (NRC)
Word Sense Disambiguation (WSD) system, as applied to the
English Lexical Sample (ELS) task in Senseval-3. The NRC system
approaches WSD as a classical supervised machine learning problem,
using familiar tools such as the Weka machine learning software
and Brill's rule-based part-of-speech tagger. Head words are
represented as feature vectors with several hundred features.
Approximately half of the features are syntactic and the other
half are semantic. The main novelty in the system is the method for
generating the semantic features, based on word co-occurrence
probabilities. The probabilities are estimated using
the Waterloo MultiText System with a corpus of about one terabyte of
unlabeled text, collected by a web crawler.